Abstract

We give a development of the ODE method for the analysis of recursive
algorithms described by a stochastic recursion. With variability modelled
via an underlying Markov process, and under general assumptions, the
following results are obtained:

(i) Stability of an associated ODE implies that the stochastic recursion
is stable in a strong sense when a gain parameter is small.

(ii) The range of gain values is quantified through a spectral analysis
of an associated linear operator, providing a non-local theory.

(iii) A second-order analysis shows precisely how variability leads to
sensitivity of the algorithm with respect to the gain parameter.

All results are obtained within the natural operator-theoretic framework
of geometrically ergodic Markov processes.

1 Introduction 

Stochastic approximation algorithms and their variants are commonly found
in control, communication and related fields. Their popularity has grown
with increased computing power and the interest in various 'machine
learning' algorithms [5, 6, 11]. When the algorithm is linear, the error
equations take the following linear recursive form:

$$X_{t+1} = [I - \alpha M_t] X_t + W_{t+1}, \qquad (1)$$

where $X = \{X_t\}$ is an error sequence, $M = \{M_t\}$ is a sequence of
$k \times k$ random matrices, $W = \{W_t\}$ is a "disturbance", and $I$ is
the $k \times k$ identity matrix.

An important example is the LMS (least mean square) algorithm. Consider
the discrete linear time-varying model:

$$y(t) = \theta(t)^T \phi(t) + n(t), \quad t \ge 0, \qquad (2)$$

where $y(t)$ and $n(t)$ are the sequences of (scalar) observations and
noise, respectively, and $\phi(t) = [\phi_1(t), \ldots, \phi_k(t)]^T$ and
$\theta(t) = [\theta_1(t), \ldots, \theta_k(t)]^T$ denote the
$k$-dimensional regression vector and time-varying parameters,
respectively. The LMS algorithm is given by the recursion

$$\hat\theta(t+1) = \hat\theta(t) + \alpha \phi(t) e(t), \qquad (3)$$

where $e(t) = y(t) - \hat\theta(t)^T \phi(t)$, and the parameter
$\alpha \in (0, 1]$ is the step size. Hence,

$$\tilde\theta(t+1) = (I - \alpha \phi(t)\phi(t)^T)\,\tilde\theta(t) + [\theta(t+1) - \theta(t) - \alpha \phi(t) n(t)], \qquad (4)$$

where $\tilde\theta(t) = \theta(t) - \hat\theta(t)$. This is of the form (1)
with $M_t = \phi(t)\phi(t)^T$,
$W_{t+1} = \theta(t+1) - \theta(t) - \alpha \phi(t) n(t)$, and
$X_t = \tilde\theta(t)$.
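The algebra leading from the LMS update (3) to the error recursion (4) can be checked numerically. The following is a minimal sketch under illustrative assumptions: the regressors, the noise level, the slow parameter drift and the helper name `lms_error_check` are hypothetical choices made for this example and do not come from the paper.

```python
import random


def lms_error_check(alpha=0.1, steps=50, seed=0):
    """Run LMS (3) and the error recursion (4) side by side.

    Returns the largest componentwise gap between theta - theta_hat and the
    state X_t propagated through X_{t+1} = (I - alpha*M_t) X_t + W_{t+1}.
    """
    rng = random.Random(seed)
    k = 2
    theta = [1.0, -0.5]        # true parameter theta(t); values are assumptions
    theta_hat = [0.0, 0.0]     # LMS estimate
    err = [theta[i] - theta_hat[i] for i in range(k)]   # X_0
    worst = 0.0
    for _ in range(steps):
        phi = [rng.choice((-1.0, 1.0)) for _ in range(k)]       # regressor phi(t)
        n = rng.gauss(0.0, 0.1)                                  # noise n(t)
        y = sum(theta[i] * phi[i] for i in range(k)) + n         # model (2)
        e = y - sum(theta_hat[i] * phi[i] for i in range(k))     # error e(t)
        theta_next = [theta[i] + rng.gauss(0.0, 0.01) for i in range(k)]
        # LMS update (3)
        theta_hat = [theta_hat[i] + alpha * phi[i] * e for i in range(k)]
        # Error recursion (4) with M_t = phi phi^T
        phi_dot_err = sum(phi[i] * err[i] for i in range(k))
        w = [theta_next[i] - theta[i] - alpha * phi[i] * n for i in range(k)]
        err = [err[i] - alpha * phi[i] * phi_dot_err + w[i] for i in range(k)]
        theta = theta_next
        direct = [theta[i] - theta_hat[i] for i in range(k)]
        worst = max(worst, max(abs(direct[i] - err[i]) for i in range(k)))
    return worst
```

Up to floating-point rounding, the two computations of the parameter error should agree at every step, which is what the returned gap measures.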
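On iterating (1) we obtain the representation,

$$\begin{aligned}
X_{t+1} &= (I - \alpha M_t) X_t + W_{t+1} \\
&= (I - \alpha M_t)\bigl[(I - \alpha M_{t-1}) X_{t-1} + W_t\bigr] + W_{t+1} \\
&= \prod_{i=t}^{0} (I - \alpha M_i)\, X_0 + \prod_{i=t}^{1} (I - \alpha M_i)\, W_1 + \cdots + (I - \alpha M_t) W_t + W_{t+1}.
\end{aligned} \qquad (5)$$

From the last expression it is clear that the matrix products
$\prod_{i=t}^{0} (I - \alpha M_i)$ play an important role in the behavior
of (1).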
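The representation (5) can be illustrated by comparing a direct iteration of (1) with the sum of matrix products. The sketch below uses small random matrices built from nested lists; the data and all helper names (`one_step_matrix`, `unrolled_vs_iterated`, and so on) are assumptions introduced only for this example.

```python
import random


def mat_vec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def one_step_matrix(m, alpha):
    """Return I - alpha*M for a square matrix M given as nested lists."""
    n = len(m)
    return [[(1.0 if i == j else 0.0) - alpha * m[i][j] for j in range(n)]
            for i in range(n)]


def unrolled_vs_iterated(alpha=0.2, t=6, seed=1):
    rng = random.Random(seed)
    k = 2
    # Hypothetical data: M_0..M_t and W_1..W_{t+1}.
    factors = [one_step_matrix([[rng.uniform(0, 2) for _ in range(k)]
                                for _ in range(k)], alpha)
               for _ in range(t + 1)]
    ws = {j: [rng.gauss(0, 1) for _ in range(k)] for j in range(1, t + 2)}
    x0 = [1.0, -1.0]

    # Direct iteration of (1): X_{s+1} = (I - alpha*M_s) X_s + W_{s+1}.
    x = x0
    for s in range(t + 1):
        x = [a + b for a, b in zip(mat_vec(factors[s], x), ws[s + 1])]

    def prod(j):
        """(I - alpha*M_t)(I - alpha*M_{t-1}) ... (I - alpha*M_j)."""
        p = identity(k)
        for i in range(t, j - 1, -1):
            p = mat_mul(p, factors[i])
        return p

    # Closed form (5).
    closed = mat_vec(prod(0), x0)
    for j in range(1, t + 1):
        closed = [a + b for a, b in zip(closed, mat_vec(prod(j), ws[j]))]
    closed = [a + b for a, b in zip(closed, ws[t + 1])]
    return x, closed
```

Both return values represent $X_{t+1}$ and should coincide up to rounding error.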

Properties of products of random matrices are of interest in a wide range
of fields. Application areas include numerical analysis [12, 30],
statistical physics [8, 9], recursive algorithms [10, 23], perturbation
theory for dynamical systems [1], queueing theory [19], and even botany
[26]. Seminal results are contained in [3, 25, 24].

A complementary and popular research area concerns the eigenstructure
of large random matrices (see e.g. [29, 13] for recent application to
capacity of communication channels). Although the results of the present
paper do not address these issues, they provide justification for
simplified models in communication theory, leading to bounds on the
capacity for time-varying communication channels [20].

The relationship with dynamical systems theory is particularly relevant
to the issues addressed here. Consider a nonlinear dynamical system
described by the equations,

$$X_{t+1} = X_t - \alpha f(X_t, \Phi_{t+1}) + W_{t+1}, \qquad (6)$$

where $\Phi = \{\Phi_t\}$ is an ergodic Markov process, evolving on a state
space $\mathsf{X}$, and $f : \mathbb{R}^k \times \mathsf{X} \to \mathbb{R}^k$
is smooth and Lipschitz continuous. Although it is, of course, impossible
to iterate a nonlinear model of this general form, we can construct a
random linear model to address many interesting issues. Viewing the
initial condition $\gamma = X_0 \in \mathbb{R}^k$ as a continuous variable,
we write $X_t(\gamma)$ as the resulting state trajectory and consider the
sensitivity matrix,

$$S_t = \frac{\partial}{\partial \gamma} X_t(\gamma), \quad t \ge 0.$$

From (6) we have the linear recursion,

$$S_{t+1} = [I - \alpha M_{t+1}] S_t, \qquad (7)$$

where $M_{t+1} = \nabla_x f(X_t, \Phi_{t+1})$, $t \ge 0$. If
$S = \{S_t\}$ is suitably stable then the same is true for the nonlinear
model, and we find that trajectories couple to a steady state process
$X^* = \{X_t^*\}$:

$$\lim_{t \to \infty} \|X_t(\gamma) - X_t^*\| = 0.$$

These ideas are related to issues developed in Section 3.
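The sensitivity recursion (7) can be illustrated in the scalar case by comparing it with a finite-difference derivative of the trajectory with respect to the initial condition. The function `f`, the two-valued driving sequence, the noise level and all names below are assumptions chosen for this sketch, not objects defined in the paper.

```python
import math
import random


def f(x, phi):
    """A smooth, Lipschitz example nonlinearity (hypothetical)."""
    return (1.0 + 0.5 * phi) * x + 0.3 * math.sin(x)


def df_dx(x, phi):
    return (1.0 + 0.5 * phi) + 0.3 * math.cos(x)


def run(gamma, alpha, phis, ws):
    """Iterate (6) from X_0 = gamma and the sensitivity (7) from S_0 = 1."""
    x, s = gamma, 1.0
    for phi, w in zip(phis, ws):
        s = (1.0 - alpha * df_dx(x, phi)) * s   # (7), uses X_t before the update
        x = x - alpha * f(x, phi) + w           # (6)
    return x, s


def sensitivity_check(alpha=0.1, steps=30, seed=2, h=1e-6):
    rng = random.Random(seed)
    phis = [rng.choice((-1.0, 1.0)) for _ in range(steps)]
    ws = [rng.gauss(0.0, 0.05) for _ in range(steps)]
    _, s = run(0.7, alpha, phis, ws)
    x_plus, _ = run(0.7 + h, alpha, phis, ws)
    x_minus, _ = run(0.7 - h, alpha, phis, ws)
    return s, (x_plus - x_minus) / (2.0 * h)
```

The first returned value is $S_t$ from (7) and the second is a central-difference estimate of $\partial X_t(\gamma)/\partial\gamma$ along the same noise path; the two should agree closely.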

The traditional analytic technique for addressing the stability of (6) or
of (1) is the ODE method of [18]. For linear models, the basic idea is
that, for small values of $\alpha$, the behavior of (1) should mimic that
of the linear ODE,

$$\frac{d}{dt}\gamma_t = -\alpha \bar{M} \gamma_t + \bar{W}, \qquad (8)$$

where $\bar{M}$ and $\bar{W}$ are means of $M_t$ and $W_t$, respectively.
To obtain a finer performance analysis one can instead compare (1) to the
linear diffusion model,

$$d\Gamma_t = -\alpha \bar{M} \Gamma_t\, dt + dB_t, \qquad (9)$$

where $B = \{B_t\}$ is a Brownian motion.

Under certain assumptions one may show that, if the ODE (8) is stable,
then the stochastic model (1) is stable in a statistical sense, and
comparisons with (9) are possible under still stronger assumptions (see
e.g. [4, 7, 17, 16] for results concerning both linear and nonlinear
recursions).
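To make the comparison between (1) and the ODE (8) concrete, consider the scalar case under the simplifying assumption (made only for this sketch) that $M_t$ is independent of $X_t$, so that the mean satisfies $E[X_{t+1}] = (1 - \alpha\bar{M})E[X_t] + \bar{W}$. The code below compares this mean recursion with the exact solution of (8) over a fixed horizon in the time scale $\alpha t$; the numerical values and function names are hypothetical.

```python
import math


def mean_recursion(x0, alpha, m_bar, w_bar, steps):
    """Iterate x <- (1 - alpha*m_bar) x + w_bar, the mean of (1) for k = 1."""
    x = x0
    for _ in range(steps):
        x = (1.0 - alpha * m_bar) * x + w_bar
    return x


def ode_solution(x0, alpha, m_bar, w_bar, t):
    """Exact solution of d/dt gamma = -alpha*m_bar*gamma + w_bar at time t."""
    x_star = w_bar / (alpha * m_bar)   # equilibrium of (8)
    return x_star + (x0 - x_star) * math.exp(-alpha * m_bar * t)


def gap(alpha, horizon=2.0, x0=1.0, m_bar=1.5, w_bar=0.0):
    """Distance between recursion and ODE after horizon/alpha steps."""
    steps = round(horizon / alpha)
    return abs(mean_recursion(x0, alpha, m_bar, w_bar, steps)
               - ode_solution(x0, alpha, m_bar, w_bar, steps))
```

In this example the gap shrinks roughly in proportion to $\alpha$, which is the sense in which the recursion mimics the ODE for small gain.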

In [23] an alternative point of view was proposed where the stability
verification problem for (1) is cast in terms of the spectral radius of an
associated discrete-time semigroup of linear operators. This approach is
based on the functional analytic setting of [22], and analogous techniques
are used in the treatment of multiplicative ergodic theory and spectral
theory in [2, 14, 15]. The main results of [23] may be interpreted as a
significant extension of the ODE method for linear recursions.

Our present results give a unified treatment of both the linear and
nonlinear models treated in [23] and [7], respectively.¹ Utilizing the
operator-theoretic framework developed in [14] makes it possible to offer
a transparent treatment and to significantly weaken the assumptions used
in earlier results.

We provide answers to the following questions: 

(i) For what range of $\alpha > 0$ is the random linear system (1)
$L_2$-stable, in the sense that $E_x[\|X_t\|^2]$ is bounded in $t$?

(ii) What does the averaged model (8) tell us about the behavior of the
original stochastic model?

(iii) What is the impact of variability on performance of recursive
algorithms?

2 Linear Theory 

In this section we develop stability theory and structural results for the
linear model (1) where $\alpha > 0$ is a fixed constant.

It is assumed that an underlying Markov chain $\Phi$, with general state
space $\mathsf{X}$, governs the statistics of (1) in the sense that $M$
and $W$ are functions of the Markov chain:

$$M_t = m(\Phi_t), \quad W_t = w(\Phi_t), \quad t \ge 0. \qquad (10)$$

We assume that the entries of the $k \times k$ matrix-valued function $m$
are bounded functions of $x \in \mathsf{X}$. Conditions on the
vector-valued function $w$ are given below.

We begin with some basic assumptions on $\Phi$, required to construct a
linear operator with useful properties.

¹ Our results are given here with only brief proof outlines; a more
detailed and complete account is in preparation.



2.1 Some spectral theory 

We assume throughout that the Markov chain $\Phi$ is geometrically ergodic
or, equivalently, $V$-uniformly ergodic. This is equivalent to assuming
the validity of the following two conditions:

Irreducibility & aperiodicity: There exists a $\sigma$-finite measure
$\psi$ on the state space $\mathsf{X}$ such that, for any
$x \in \mathsf{X}$ and any measurable $A \subset \mathsf{X}$ with
$\psi(A) > 0$,

$$P^t(x, A) := \mathsf{P}\{\Phi_t \in A \mid \Phi_0 = x\} > 0, \quad \text{for all sufficiently large } t > 0.$$

Geometric drift: There exists a Lyapunov function
$V : \mathsf{X} \to [1, \infty)$, $\gamma < 1$, $b < \infty$,
$t_0 \ge 1$, a 'small set' $C$, and a 'small measure' $\nu$, satisfying

$$\begin{aligned}
PV(x) &\le \gamma V(x) + b I_C(x), & x &\in \mathsf{X}, \\
P^{t_0}(x, \cdot) &\ge \nu(\cdot), & x &\in C.
\end{aligned} \qquad (11)$$

Under these assumptions it is known that $\Phi$ is ergodic and has a unique
invariant probability measure $\pi$, to which it converges geometrically
fast, and without loss of generality we can assume that
$\pi(V^2) < \infty$. For a detailed development of geometrically ergodic
Markov processes see [21, 22, 14].

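We let $L_\infty^V$ denote the set of measurable vector-valued functions
$g : \mathsf{X} \to \mathbb{C}^k$ satisfying

$$\|g\|_V := \sup_{x \in \mathsf{X}} \frac{\|g(x)\|}{V(x)} < \infty,$$

where $\|\cdot\|$ is the Euclidean norm on $\mathbb{C}^k$, and
$V : \mathsf{X} \to [1, \infty)$ is the Lyapunov function as above. For a
linear operator $\mathcal{L} : L_\infty^V \to L_\infty^V$ we define the
induced operator norm via

$$|||\mathcal{L}|||_V := \sup \frac{\|\mathcal{L} f\|_V}{\|f\|_V},$$

where the supremum is over all non-zero $f \in L_\infty^V$. We say that
$\mathcal{L}$ is a bounded linear operator if
$|||\mathcal{L}|||_V < \infty$, and its spectral radius is then given by

$$\xi := \lim_{t \to \infty} \bigl(|||\mathcal{L}^t|||_V\bigr)^{1/t}. \qquad (12)$$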
The spectrum $\mathcal{S}(\mathcal{L})$ of the linear operator
$\mathcal{L}$ is

$$\mathcal{S}(\mathcal{L}) := \{z \in \mathbb{C} : (Iz - \mathcal{L})^{-1} \text{ does not exist as a bounded linear operator on } L_\infty^V\}.$$

If $\mathcal{L}$ is a finite matrix, its spectrum is just the collection
of all its eigenvalues. Generally, for the linear operators considered in
this paper, the dimension of $\mathcal{L}$ and its spectrum will be
infinite.

The family of linear operators
$\mathcal{L}_\alpha : L_\infty^V \to L_\infty^V$, $\alpha \in \mathbb{R}$,
that will be used to analyze the recursion (1) are defined by,

$$\mathcal{L}_\alpha f(x) := E\bigl[(I - \alpha m(\Phi_1))^T f(\Phi_1) \mid \Phi_0 = x\bigr] = E_x\bigl[(I - \alpha M_1)^T f(\Phi_1)\bigr], \qquad (13)$$

and we let $\xi_\alpha$ denote the spectral radius of
$\mathcal{L}_\alpha$.

We assume throughout the paper that
$m : \mathsf{X} \to \mathbb{R}^{k \times k}$ is a bounded function. Under
these conditions we obtain the following result as in [23].

Theorem 2.1 There exists $\alpha_0 > 0$ such that for
$\alpha \in (0, \alpha_0)$, $\xi_\alpha < \infty$, and

To ensure that the recursion (1) is stable it is necessary that the
spectral radius satisfy $\xi_\alpha < 1$. Under this condition it is
obvious that the mean $E[X_t]$ is uniformly bounded in $t$. The following
result summarizes additional conclusions obtained below.

Theorem 2.2 Suppose that the eigenvalues of
$\bar{M} := \int m(x)\, \pi(dx)$ are all positive, and that
$w^2 \in L_\infty^V$, where the square is interpreted component-wise.
Then, there exists a bounded open set $O \subset \mathbb{R}$ containing
$(0, \alpha_0)$, where $\alpha_0$ is given in Theorem 2.1, such that:

(i) For all $\alpha \in O$ we have $\xi_\alpha < 1$, and for any initial
condition $\Phi_0 = x \in \mathsf{X}$,

$$E_x[\|X_t\|^2] \to \sigma_\alpha^2 < \infty, \quad \text{geometrically fast, as } t \to \infty.$$

(ii) If $\Phi$ is stationary, then for $\alpha \in O$ there exists a
stationary process $X^\alpha$ such that for any initial condition
$\Phi_0 = x \in \mathsf{X}$, $X_0 = \gamma \in \mathbb{R}^k$,

$$E[\|X_t(\gamma) - X_t^\alpha\|^2] \to 0, \quad \text{geometrically fast, as } t \to \infty.$$
(iii) If $\alpha \notin O$ and $W$ is i.i.d. with covariance
$\Sigma_W \neq 0$, then $E_x[\|X_t\|^2]$ is unbounded.
Figure 1: The graph shows how $\lambda_\alpha := \xi_\alpha$ varies with
$\alpha$, with the stability region marked. When $\alpha$ is close to 0,
Theorem 2.4 below implies that the ODE (8) determines stability of the
algorithm since it determines whether or not $\xi_\alpha < 1$. A
second-derivative formula is also given in Theorem 2.4: If $\lambda_0''$
is large, then the range of $\alpha$ for stability will be
correspondingly small.



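Proof Outline for Theorem 2.2 Iterating the system equation (5) we
may express the expectation $E_x[X_{t+1}^T X_{t+1}]$ as a sum of terms of
the form,

$$E_x\Bigl[ W_j^T \Bigl(\prod_{i=t}^{j} (I - \alpha M_i)\Bigr)^T \Bigl(\prod_{i=t}^{k} (I - \alpha M_i)\Bigr) W_k \Bigr], \quad j, k = 0, \ldots, t. \qquad (14)$$

For simplicity consider the case $j = k$. Taking conditional expectations
at time $j$, one can then express the expectation (14) as

$$E_x\bigl[\operatorname{trace}\bigl( (\mathcal{Q}_\alpha^{t-j} h(\Phi_j))\, w(\Phi_j) w(\Phi_j)^T \bigr)\bigr],$$

where $\mathcal{Q}_\alpha$ is defined in (19), and
$h \equiv I_{k \times k}$. We define $O$ as the set of $\alpha$ such that
the spectral radius of this linear operator is strictly less than unity.
Thus, for $\alpha \in O$ we have, for some $\eta_\alpha < 1$,

$$\operatorname{trace}\bigl( (\mathcal{Q}_\alpha^{t-j} h(y))\, w(y) w(y)^T \bigr) = O\bigl(V(y)^2\, \eta_\alpha^{\,t-j}\bigr), \quad \Phi_j = y \in \mathsf{X}.$$

Similar reasoning may be applied for arbitrary $k, j$, and this shows that
$E[\|X_t\|^2]$ is bounded in $t \ge 0$ for any deterministic initial
conditions $\Phi_0 = x \in \mathsf{X}$, $X_0 = \gamma \in \mathbb{R}^k$.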

To construct the stationary process $X^\alpha$ we apply backward coupling
as developed in [28]. Consider the system starting at time $-n$,
initialized at $\gamma = 0$, and let $X_t^{\alpha,n}$, $t \ge -n$, denote
the resulting state trajectory. We then have for all $n, m \ge 1$,

$$X_{t+1}^{\alpha,m} - X_{t+1}^{\alpha,n} = \Bigl(\prod_{i=t}^{0} (I - \alpha M_i)\Bigr) \bigl[X_0^{\alpha,m} - X_0^{\alpha,n}\bigr], \quad t \ge 0,$$

which implies convergence in $L_2$ to a stationary process:
$X_t^\alpha := \lim_{n \to \infty} X_t^{\alpha,n}$, $t \ge 0$. We can then
compare to the process initialized at $t = 0$,

$$X_{t+1}^\alpha - X_{t+1}(\gamma) = \Bigl(\prod_{i=t}^{0} (I - \alpha M_i)\Bigr) \bigl[X_0^\alpha - X_0(\gamma)\bigr], \quad t \ge 0,$$

and the same reasoning as before gives (ii). □

2.2 Spectral decompositions

Next we show that $\lambda_\alpha := \xi_\alpha$ is in fact an eigenvalue
of $\mathcal{L}_\alpha$ for a range of $\alpha \sim 0$, and we use this
fact to obtain a multiplicative ergodic theorem. The maximal eigenvalue
$\lambda_\alpha$ in Theorem 2.3 is a generalization of the
Perron-Frobenius eigenvalue; cf. [27, 14].

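Theorem 2.3 Suppose that the eigenvalues $\{\lambda_i(\bar{M})\}$ of
$\bar{M}$ are distinct. Then,

(i) There exists $\varepsilon_0 > 0$ such that the linear operator
$\mathcal{L}_z$ has $k$ distinct eigenvalues
$\{\lambda_{1,z}, \ldots, \lambda_{k,z}\} \subset \mathcal{S}(\mathcal{L}_z)$
for all $z \in B(\varepsilon_0) := \{z \in \mathbb{C} : |z| < \varepsilon_0\}$,
and $\lambda_{i,z}$ is an analytic function of $z$ in this domain for
each $i$.

(ii) For $z \in B(\varepsilon_0)$ there are associated eigenfunctions
$\{h_{1,z}, \ldots, h_{k,z}\} \subset L_\infty^V$ and eigenmeasures
$\{\mu_{1,z}, \ldots, \mu_{k,z}\} \subset \mathcal{M}_1^V$ satisfying

$$\mathcal{L}_z h_{i,z} = \lambda_{i,z} h_{i,z}, \qquad \mu_{i,z} \mathcal{L}_z = \lambda_{i,z} \mu_{i,z}.$$

Moreover, for each $i$, $x \in \mathsf{X}$, $A \in \mathcal{B}(\mathsf{X})$,
$\{h_{i,z}(x), \mu_{i,z}(A)\}$ are analytic functions on
$B(\varepsilon_0)$.

(iii) Suppose moreover that the eigenvalues $\{\lambda_i(\bar{M})\}$ are
real. Then we may take $\varepsilon_0 > 0$ sufficiently small so that
$\{\lambda_{i,\alpha}, h_{i,\alpha}, \mu_{i,\alpha}\}$ are real for
$\alpha \in (0, \varepsilon_0)$. The maximal eigenvalue
$\lambda_\alpha := \max_i \lambda_{i,\alpha}$ is equal to $\xi_\alpha$,
and the corresponding eigenfunction and eigenmeasure may be scaled so that
the following limit holds:

$$\lambda_\alpha^{-t} \mathcal{L}_\alpha^t \to h_\alpha \otimes \mu_\alpha, \quad t \to \infty,$$

where the convergence is in the $V$-norm.

In fact, there exist $\delta_0 > 0$ and $b_0 < \infty$ such that for any
$f \in L_\infty^V$ the following bound holds:

$$\Bigl\| \lambda_\alpha^{-t} E_x\Bigl[\Bigl(\prod_{i=1}^{t} (I - \alpha M_i)^T\Bigr) f(\Phi_t)\Bigr] - h_\alpha(x) \mu_\alpha(f) \Bigr\| \le \|f\|_V\, b_0 e^{-\delta_0 t} V(x).$$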

Proof. The linear operator $\mathcal{L}_0$ possesses a $k$-dimensional
eigenspace corresponding to the eigenvalue $\lambda_0 = 1$. This
eigenspace is precisely the set of constant functions, with a
corresponding basis of eigenfunctions given by $\{e^i\}$, where $e^i$ is
the $i$th basis element in $\mathbb{R}^k$. The $k$-dimensional set of
vector-valued eigenmeasures $\{\pi^i\}$ given by $\pi^i = e^{iT} \pi$ spans
the set of all eigenmeasures with eigenvalue $\lambda_{0,i} = 1$.

Consider the linear operator defined by

$$\Pi f(x) = (\pi(f_1), \ldots, \pi(f_k))^T = \sum_i [e^i \otimes \pi^i] f, \quad f \in L_\infty^V.$$

It is obvious that $\Pi : L_\infty^V \to L_\infty^V$ is a rank-$k$ linear
operator, and for $\alpha = 0$ we have from the $V$-uniform ergodic
theorem of [21],

$$\mathcal{L}_0^t - \Pi = [\mathcal{L}_0 - \Pi]^t \to 0, \quad t \to \infty,$$

where the convergence is in norm, and hence takes place exponentially
fast. It follows that the spectral radius of $(\mathcal{L}_0 - \Pi)$ is
strictly less than unity. By standard arguments it follows that, for some
$\varepsilon_0 > 0$, the spectral radius of $\mathcal{L}_z - \Pi$ is also
strictly less than unity for $z \in B(\varepsilon_0)$. The results then
follow as in Theorem 3 of [15]. □

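Conditions under which the bound $\xi_\alpha < 1$ is satisfied are given
in Theorem 2.4, where we also provide formulae for the derivatives of
$\lambda_\alpha$:

Theorem 2.4 Suppose that the eigenvalues $\{\lambda_i(\bar{M})\}$ are real
and distinct. Then, the maximal eigenvalue $\lambda_\alpha = \xi_\alpha$
satisfies,

(i) The first derivative is given by,

$$\frac{d}{d\alpha} \lambda_\alpha \Big|_{\alpha=0} = -\lambda_{\min}(\bar{M}).$$

(ii) The second derivative is given by,

$$\frac{d^2}{d\alpha^2} \lambda_\alpha \Big|_{\alpha=0} = 2 \sum_{l=0}^{\infty} v_0^T E_\pi\bigl[(M_0 - \bar{M})(M_{l+1} - \bar{M})\bigr] r_0,$$

where $r_0$ is a right eigenvector of $\bar{M}$ corresponding to
$\lambda_{\min}(\bar{M})$, and $v_0$ is the left eigenvector, normalized
so that $v_0^T r_0 = 1$.

(iii) Suppose that $m(x) = m^T(x)$, $x \in \mathsf{X}$. Then we may take
$v_0 = r_0$ in (ii), and the second derivative may be expressed,

$$\frac{d^2}{d\alpha^2} \lambda_\alpha \Big|_{\alpha=0} = \operatorname{trace}(\Gamma - \Sigma),$$

where $\Gamma$ is the Central Limit Theorem covariance for the stationary
vector-valued stochastic process $F_k = [M_k - \bar{M}] v_0$, and
$\Sigma = E_\pi[F_k F_k^T]$ is its variance.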


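Proof. To prove (i), we differentiate the eigenfunction equation
$\mathcal{L}_\alpha h_\alpha = \lambda_\alpha h_\alpha$ to obtain

$$\mathcal{L}_\alpha' h_\alpha + \mathcal{L}_\alpha h_\alpha' = \lambda_\alpha' h_\alpha + \lambda_\alpha h_\alpha'. \qquad (15)$$

Setting $\alpha = 0$ then gives a version of Poisson's equation,

$$\mathcal{L}_0' h_0 + P h_0' = \lambda_0' h_0 + h_0', \qquad (16)$$

where $\mathcal{L}_0' h_0 = E_x[-m(\Phi_1)^T h_0(\Phi_1)]$. Since
$h_0' \in L_\infty^V$ we may integrate both sides with respect to the
invariant probability $\pi$ to obtain

$$E_\pi[-m(\Phi_1)^T]\, h_0 = -\bar{M}^T h_0 = \lambda_0' h_0.$$

This shows that $\lambda_0'$ is an eigenvalue of $-\bar{M}$, and $h_0$ is
an associated eigenvector for $\bar{M}^T$. It follows that
$\lambda_0' = -\lambda_{\min}(\bar{M})$ by maximality of $\lambda_\alpha$.

We note that Poisson's equation (16) combined with equation (17.39) of
[21] implies the formula,

$$h_0'(x) = E_\pi[h_0'(\Phi(0))] - \sum_{l=0}^{\infty} E_x\bigl[(M_{l+1} - \bar{M})^T\bigr] h_0. \qquad (17)$$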

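To prove (ii) we consider the second-derivative formula,

$$\mathcal{L}_\alpha'' h_\alpha + 2\mathcal{L}_\alpha' h_\alpha' + \mathcal{L}_\alpha h_\alpha'' = \lambda_\alpha'' h_\alpha + 2\lambda_\alpha' h_\alpha' + \lambda_\alpha h_\alpha''.$$

Evaluating these expressions at $\alpha = 0$ and integrating with respect
to $\pi$ then gives the steady state expression,

$$\lambda_0'' h_0 = -2 E_\pi\bigl[(M_1^T + \lambda_0' I)\, h_0'(\Phi_1)\bigr]. \qquad (18)$$

In deriving this identity we have used the expressions,

$$\mathcal{L}_0' f(x) = -E_x[M_1^T f(\Phi_1)], \quad \mathcal{L}_0'' f(x) = 0, \quad f \in L_\infty^V, \ x \in \mathsf{X}.$$

Combining (17) with (18) gives the desired formula, since we may take
$v_0 = h_0$ in (ii).

To prove (iii) we simply note that in the symmetric case the formula in
(ii) becomes,

$$\lambda_0'' = \sum_{k \neq 0} E_\pi[F_0^T F_k] = \operatorname{trace}(\Gamma - \Sigma).$$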
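The derivative formulae of Theorem 2.4 can be checked by hand in the simplest setting. The sketch below assumes the scalar case $k = 1$ with a two-state Markov chain, so that $\mathcal{L}_\alpha$ is the $2 \times 2$ matrix $P\,\mathrm{diag}(1 - \alpha m)$ and $v_0 = r_0 = 1$. For a two-state chain the autocovariance is $\mathrm{Cov}_\pi(M_0, M_l) = \mathrm{Var}_\pi(m)\,\rho^l$ with $\rho = 1 - p - q$, so the sum in (ii) is a geometric series. The transition probabilities, the values of $m$, and the function names are all illustrative assumptions.

```python
import math


def top_eigenvalue(alpha, p=0.3, q=0.2, m=(0.5, 2.0)):
    """Largest eigenvalue of P*diag(1 - alpha*m), P = [[1-p, p], [q, 1-q]]."""
    a11 = (1 - p) * (1 - alpha * m[0])
    a12 = p * (1 - alpha * m[1])
    a21 = q * (1 - alpha * m[0])
    a22 = (1 - q) * (1 - alpha * m[1])
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))


def predicted_derivatives(p=0.3, q=0.2, m=(0.5, 2.0)):
    """Theorem 2.4 (i) and (ii) specialised to k = 1 and two states."""
    pi = (q / (p + q), p / (p + q))            # invariant distribution
    m_bar = pi[0] * m[0] + pi[1] * m[1]
    var = pi[0] * (m[0] - m_bar) ** 2 + pi[1] * (m[1] - m_bar) ** 2
    rho = 1.0 - p - q                          # second eigenvalue of P
    first = -m_bar                             # -lambda_min(M_bar)
    second = 2.0 * var * rho / (1.0 - rho)     # 2 * sum_{l>=1} Cov(M_0, M_l)
    return first, second


def finite_differences(h=1e-3):
    """Central differences of alpha -> lambda_alpha at alpha = 0."""
    lam = top_eigenvalue
    first = (lam(h) - lam(-h)) / (2.0 * h)
    second = (lam(h) - 2.0 * lam(0.0) + lam(-h)) / (h * h)
    return first, second
```

Comparing `predicted_derivatives()` with `finite_differences()` shows the two pairs agreeing to several digits for these parameter values; positive correlation in $M$ (here $\rho > 0$) gives a positive second derivative.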

2.3 Second-order statistics

In order to understand the second-order statistics of $X$ it is convenient
to introduce another linear operator $\mathcal{Q}_\alpha$, as follows,

$$\mathcal{Q}_\alpha f(x) = E\bigl[(I - \alpha m(\Phi_1))^T f(\Phi_1) (I - \alpha m(\Phi_1)) \mid \Phi(0) = x\bigr] = E_x\bigl[(I - \alpha M_1)^T f(\Phi_1) (I - \alpha M_1)\bigr], \qquad (19)$$

where the domain of $\mathcal{Q}_\alpha$ is the collection of
matrix-valued functions $f : \mathsf{X} \to \mathbb{C}^{k \times k}$. When
considering $\mathcal{Q}_\alpha$ we redefine $L_\infty^V$ accordingly. It
is clear that $\mathcal{Q}_\alpha : L_\infty^V \to L_\infty^V$ is a
bounded linear operator under the geometric drift condition and the
boundedness assumption on $m$.

Let $\xi_\alpha^{\mathcal{Q}}$ denote the spectral radius of
$\mathcal{Q}_\alpha$. We can again argue that $\xi_\alpha^{\mathcal{Q}}$
is smooth in a neighborhood of the origin, and the following follows as
in Theorem 2.4:

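Theorem 2.5 Assume that the eigenvalues of $\bar{M}$ are real and
distinct. Then there exists $\varepsilon_0 > 0$ such that for each
$z \in B(\varepsilon_0)$ there exists an eigenvalue
$\eta_z \in \mathbb{C}$ for $\mathcal{Q}_z$ satisfying
$|\eta_z| = \xi_z^{\mathcal{Q}}$, and $\eta_\alpha$ is real for real
$\alpha \in (0, \varepsilon_0)$. The eigenvalue $\eta_z$ is smooth on
$B(\varepsilon_0)$ and satisfies,

$$\eta_0' = -2\lambda_{\min}(\bar{M}).$$

Proof. This is again based on differentiation of the eigenfunction
equation given by $\mathcal{Q}_\alpha h_\alpha = \eta_\alpha h_\alpha$,
where $\eta_\alpha$ and $h_\alpha$ are the eigenvalue and matrix-valued
eigenfunction, respectively. Taking derivatives on both sides gives

$$\mathcal{Q}_\alpha' h_\alpha + \mathcal{Q}_\alpha h_\alpha' = \eta_\alpha' h_\alpha + \eta_\alpha h_\alpha', \qquad (20)$$

where
$\mathcal{Q}_0' h_0 = E_x[-m(\Phi_1)^T h_0(\Phi_1) - h_0(\Phi_1) m(\Phi_1)]$.
As before, we then obtain the steady-state expression,

$$E_\pi[-m(\Phi_1)^T h_0 - h_0 m(\Phi_1)] = -\bar{M}^T h_0 - h_0 \bar{M} = \eta_0' h_0. \qquad (21)$$

And, as before, we may conclude that
$\eta_0' = 2\lambda_0' = -2\lambda_{\min}(\bar{M})$. □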
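The relation $\eta_0' = 2\lambda_0' = -2\lambda_{\min}(\bar{M})$ can be illustrated in the scalar case $k = 1$ with a two-state chain, where $\mathcal{L}_\alpha$ and $\mathcal{Q}_\alpha$ reduce to the matrices $P\,\mathrm{diag}(1 - \alpha m)$ and $P\,\mathrm{diag}((1 - \alpha m)^2)$. This is only a sketch: the chain, the values of $m$, and the helper names are assumptions made for the example.

```python
import math


def top_eig(p, q, d):
    """Largest eigenvalue of P*diag(d) for P = [[1-p, p], [q, 1-q]]."""
    a11, a12 = (1 - p) * d[0], p * d[1]
    a21, a22 = q * d[0], (1 - q) * d[1]
    tr, det = a11 + a22, a11 * a22 - a12 * a21
    return 0.5 * (tr + math.sqrt(tr * tr - 4.0 * det))


def slopes_at_zero(p=0.3, q=0.2, m=(0.5, 2.0), h=1e-4):
    """Central-difference slopes of lambda_alpha and eta_alpha at alpha = 0."""
    def lam(a):   # spectral radius of L_alpha (k = 1)
        return top_eig(p, q, [1 - a * mi for mi in m])

    def eta(a):   # spectral radius of Q_alpha (k = 1)
        return top_eig(p, q, [(1 - a * mi) ** 2 for mi in m])

    d_lam = (lam(h) - lam(-h)) / (2.0 * h)
    d_eta = (eta(h) - eta(-h)) / (2.0 * h)
    m_bar = (q * m[0] + p * m[1]) / (p + q)   # mean of m under pi
    return d_lam, d_eta, m_bar
```

For these values the numerical slopes should match $-\bar{M}$ and $-2\bar{M}$ respectively, up to the finite-difference error.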

2.4 An illustrative example

Consider the discrete-time, linear time-varying model

$$y_t = \theta_t^T \phi_t + N_t, \quad t \ge 0, \qquad (22)$$

where $y = \{y_t\}$ is a sequence of scalar observations,
$N = \{N_t\}$ is a noise process, $\{\phi_t\}$ is the sequence of
$k$-dimensional regression vectors, and $\{\theta_t\}$ are
$k$-dimensional time-varying parameters. In this section we illustrate
the results above using the LMS (least mean square) parameter estimation
algorithm,

$$\hat\theta_{t+1} = \hat\theta_t + \alpha \phi_t e_t,$$

where $e$ is the error sequence,
$e_t := y_t - \hat\theta_t^T \phi_t$, $t \ge 0$.

As in the Introduction, writing
$\tilde\theta_t = \theta_t - \hat\theta_t$ we obtain

$$\tilde\theta_{t+1} = (I - \alpha \phi_t \phi_t^T)\, \tilde\theta_t + [\theta_{t+1} - \theta_t - \alpha \phi_t N_t].$$

This is of the form (1) with $X_t = \tilde\theta_t$,
$M_t = \phi_t \phi_t^T$ and
$W_{t+1} = \theta_{t+1} - \theta_t - \alpha \phi_t N_t$.

For the sake of simplicity and to facilitate explicit numerical
calculations, we consider the following special case: We assume that
$\phi$ is of the form $\phi_t = (s_t, s_{t-1})^T$, where the sequence $s$
is Bernoulli ($s_t = \pm 1$ with equal probability) and take $N$ to be an
i.i.d. noise sequence.

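In analyzing the random linear system we may ignore the noise $N$ and
take $\Phi = \phi$. This is clearly geometrically ergodic since it is an
ergodic, finite state space Markov chain, with four possible states. In
fact, $\Phi$ is geometrically ergodic with Lyapunov function
$V \equiv 1$. Viewing $h \in L_\infty^V$ as a vector in $\mathbb{R}^8$,
the eigenfunction equation for $\mathcal{L}_\alpha$ becomes

$$A_\alpha h_\alpha = \lambda_\alpha h_\alpha, \qquad (23)$$

where, with the states $(s_t, s_{t-1})$ ordered as $(1,1)$, $(1,-1)$,
$(-1,1)$, $(-1,-1)$,

$$A_\alpha = \frac{1}{2} \begin{bmatrix}
A_1 & A_0 & A_2 & A_0 \\
A_1 & A_0 & A_2 & A_0 \\
A_0 & A_2 & A_0 & A_1 \\
A_0 & A_2 & A_0 & A_1
\end{bmatrix}, \quad
A_0 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad
A_1 = \begin{bmatrix} 1-\alpha & -\alpha \\ -\alpha & 1-\alpha \end{bmatrix}, \quad
A_2 = \begin{bmatrix} 1-\alpha & \alpha \\ \alpha & 1-\alpha \end{bmatrix}.$$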

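In this case, we have the following local behavior:

Theorem 2.6 In a neighborhood of 0, the spectral radii of
$\mathcal{L}_\alpha$, $\mathcal{Q}_\alpha$ satisfy

$$\frac{d}{d\alpha} \xi_\alpha^{\mathcal{L}} \Big|_{\alpha=0} = -\lambda_{\min}(\bar{M}), \qquad \frac{d^n}{d\alpha^n} \xi_\alpha^{\mathcal{L}} \Big|_{\alpha=0} = 0, \quad n \ge 2;$$

$$\frac{d}{d\alpha} \xi_\alpha^{\mathcal{Q}} \Big|_{\alpha=0} = -2\lambda_{\min}(\bar{M}), \qquad \frac{d^n}{d\alpha^n} \xi_\alpha^{\mathcal{Q}} \Big|_{\alpha=0} = 0, \quad n \ge 3.$$

So $\lambda_\alpha$ and $\eta_\alpha$ are linear and quadratic around 0,
respectively.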
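The linear behavior of $\lambda_\alpha$ in this example can be observed numerically. The sketch below applies $\mathcal{L}_\alpha$, in the form $E_x[(I - \alpha M_1)^T f(\Phi_1)]$, to functions on the four states $(s_t, s_{t-1})$ and estimates the spectral radius by power iteration. Since $\bar{M} = I$ here, Theorem 2.6 predicts $\lambda_\alpha = 1 - \alpha$ for small $\alpha$. The state encoding, the starting function and all names are assumptions of this illustration.

```python
from itertools import product

STATES = list(product((1, -1), repeat=2))   # Phi_t = (s_t, s_{t-1})


def apply_L(h, alpha):
    """One application of L_alpha: (L h)(x) = E_x[(I - alpha*M_1)^T h(Phi_1)]."""
    out = {}
    for (a, b) in STATES:
        acc = [0.0, 0.0]
        for s_new in (1, -1):               # s_{t+1} = +/-1 with probability 1/2
            c = s_new * a                   # off-diagonal entry of M_1 = phi phi^T
            h1 = h[(s_new, a)]              # next state is (s_{t+1}, s_t)
            # I - alpha*M_1 is symmetric: [[1-alpha, -alpha*c], [-alpha*c, 1-alpha]]
            acc[0] += 0.5 * ((1 - alpha) * h1[0] - alpha * c * h1[1])
            acc[1] += 0.5 * (-alpha * c * h1[0] + (1 - alpha) * h1[1])
        out[(a, b)] = acc
    return out


def sup_norm(h):
    return max(abs(v) for vec in h.values() for v in vec)


def spectral_radius(alpha, iters=200):
    """Power-iteration estimate of the spectral radius of L_alpha."""
    h = {x: [1.0 + 0.1 * i, 0.5 - 0.2 * i] for i, x in enumerate(STATES)}
    rho = 0.0
    for _ in range(iters):
        new = apply_L(h, alpha)
        scale = sup_norm(new)
        rho = scale / sup_norm(h)
        h = {x: [v / scale for v in vec] for x, vec in new.items()}
    return rho
```

For small positive gains such as $\alpha = 0.1$ the estimate should be close to $1 - \alpha$. The power iteration is used here only for simplicity; any eigenvalue routine applied to the $8 \times 8$ matrix in (23) would serve the same purpose.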
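Proof. This follows from differentiating the respective eigenfunction
equations. Here we only show the proof for operator $\mathcal{Q}$; the
proof for operator $\mathcal{L}$ is similar.

Taking derivatives on both sides of the eigenfunction equation for
$\mathcal{Q}_\alpha$ gives,

$$\mathcal{Q}_\alpha' h_\alpha + \mathcal{Q}_\alpha h_\alpha' = \eta_\alpha' h_\alpha + \eta_\alpha h_\alpha'. \qquad (24)$$

Setting $\alpha = 0$ gives a version of Poisson's equation,

$$\mathcal{Q}_0' h_0 + \mathcal{Q}_0 h_0' = \eta_0' h_0 + \eta_0 h_0'. \qquad (25)$$

Using the identities for $h_0$ and
$\mathcal{Q}_0' h_0 = E_x[-M_1^T h_0 - h_0 M_1]$, we obtain the steady
state expression

$$\bar{M}^T h_0 + h_0 \bar{M} = -\eta_0' h_0. \qquad (26)$$

Since $\bar{M} = I$, we have $\eta_0' = -2$. Now, taking the 2nd
derivatives on both sides of (24) gives,

$$\mathcal{Q}_\alpha'' h_\alpha + 2\mathcal{Q}_\alpha' h_\alpha' + \mathcal{Q}_\alpha h_\alpha'' = \eta_\alpha'' h_\alpha + 2\eta_\alpha' h_\alpha' + \eta_\alpha h_\alpha''. \qquad (27)$$

Letting $\alpha = 0$ and considering the steady state, we obtain

$$2\bar{M}^T h_0 \bar{M} - 2E_\pi[M^T h_0' + h_0' M] = \eta_0'' h_0 + 2\eta_0' E_\pi[h_0']. \qquad (28)$$

Poisson's equation (25) combined with equation (26) and equation (17.39)
of [21] implies the formula,

$$\begin{aligned}
h_0'(x) &= E_\pi(h_0') + \sum_{l=0}^{\infty} E_x\bigl[-M_{l+1}^T h_0 - h_0 M_{l+1} - \eta_0' h_0\bigr] \\
&= E_\pi(h_0') + \sum_{l=0}^{\infty} E_x\bigl[(\bar{M} - M_{l+1})^T h_0 + h_0 (\bar{M} - M_{l+1})\bigr].
\end{aligned} \qquad (29)$$

So, from $\bar{M} = I$, $\eta_0' = -2$ and (28) we have $\eta_0'' = 2$.
In order to show $\eta_\alpha$ is quadratic near zero, we differentiate
both sides of (27) once more and consider the steady state at
$\alpha = 0$,

$$\mathcal{Q}_0''' h_0 + 3\mathcal{Q}_0'' h_0' + 3\mathcal{Q}_0' h_0'' + \mathcal{Q}_0 h_0''' = \eta_0''' h_0 + 3\eta_0'' h_0' + 3\eta_0' h_0'' + \eta_0 h_0'''. \qquad (30)$$

With equation (17.39) of [21] and $\eta_0' = -2$ and $\eta_0'' = 2$, we
can show $\eta_0''' = 0$ and $\eta_0^{(n)} = 0$ for $n > 3$, hence
$\eta_\alpha$ is quadratic around 0. □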
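
3 Nonlinear models

We now turn to the nonlinear model shown in (6). We take the special form,

$$X_{t+1} = X_t - \alpha [f(X_t, \Phi_{t+1}) + W_{t+1}]. \qquad (31)$$

We continue to assume that $\Phi$ is geometrically ergodic, and that
$W_t = w(\Phi_t)$, $t \ge 0$, with $w^2 \in L_\infty^V$. The associated
ODE is given by

$$\frac{d}{dt}\gamma_t = -\bar{f}(\gamma_t), \qquad (32)$$

where $\bar{f}(\gamma) = \int f(\gamma, x)\, \pi(dx)$,
$\gamma \in \mathbb{R}^k$. The minus sign matches the sign convention of
(31) and of the linear ODE (8).

We assume that $\bar{W} = E_\pi[W_1] = 0$, and the following conditions
are imposed on $f$:

(N1) The function $f$ is Lipschitz, and there exists a function
$\bar{f}_\infty : \mathbb{R}^k \to \mathbb{R}^k$ such that

$$\lim_{r \to \infty} r^{-1} \bar{f}(r\gamma) = \bar{f}_\infty(\gamma), \quad \gamma \in \mathbb{R}^k.$$

Furthermore, the origin in $\mathbb{R}^k$ is an asymptotically stable
equilibrium point for the ODE,

$$\frac{d}{dt}\gamma_t^\infty = -\bar{f}_\infty(\gamma_t^\infty). \qquad (33)$$

(N2) There exists $b_f < \infty$ such that
$\sup_{\gamma \in \mathbb{R}^k} \|f(\gamma, x) - \bar{f}(\gamma)\|^2 \le b_f V(x)$,
$x \in \mathsf{X}$.

(N3) There exists a unique stationary point $x^*$ for the ODE (32) that is
a globally asymptotically stable equilibrium.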

Define the absolute error by

$$e_t := \|X_t - x^*\|, \quad t \ge 0. \qquad (34)$$

The following result is an extension of Theorem 1 of [7] to Markov models:

Theorem 3.1 Assume that (N1)-(N3) hold. Then there exists
$\varepsilon_0 > 0$ such that for any $0 < \alpha < \varepsilon_0$:

(i) For any $\delta > 0$, there exists $b_1 = b_1(\delta) < \infty$ such
that

$$\limsup_{n \to \infty} \mathsf{P}(e_n \ge \delta) \le b_1 \alpha.$$

(ii) If the origin is a globally exponentially asymptotically stable
equilibrium for the ODE (32), then there exists $b_2 < \infty$ such that
for every initial condition $\Phi_0 = x \in \mathsf{X}$,
$X_0 = \gamma \in \mathbb{R}^k$,

$$\limsup_{n \to \infty} E[e_n^2] \le b_2 \alpha.$$

Proof Outline for Theorem 3.1 The continuous-time process
$\{x_t^o : t \ge 0\}$ is defined to be the interpolated version of $X$
given as follows: Let $T_j = j\alpha$, $j \ge 0$, and define
$x^o(T_j) = X_j$, with $x^o$ defined by linear interpolation on the
remainder of $[T_j, T_{j+1}]$ to form a piecewise linear function. Using
geometric ergodicity we can bound the error between $x^o$ and solutions to
the ODE (32) as in [7], and we may conclude that the joint process
$\{X, \Phi\}$ is geometrically ergodic with Lyapunov function
$V_2(\gamma, x) = \|\gamma\|^2 + V(x)$. □

We conclude with an extension of Theorem 2.2 describing the behavior
of the sensitivity process $S$.

Theorem 3.2 Assume that (N1)-(N3) hold, and that the eigenvalues of the
matrix $\bar{M}$ have strictly positive real part, where

$$\bar{M} := \nabla \bar{f}(x^*).$$

Then there exists $\varepsilon_1 > 0$ such that for any
$0 < \alpha < \varepsilon_1$, the conclusions of Theorem 3.1 (ii) hold,
and, in addition:

(i) The spectral radius of the random linear system (1) describing the
evolution of the sensitivity process is strictly less than one.

(ii) There exists a stationary process $X^\alpha$ such that for any
initial condition $\Phi_0 = x \in \mathsf{X}$,
$X_0 = \gamma \in \mathbb{R}^k$,

$$E[\|X_t - X_t^\alpha\|^2] \to 0, \quad t \to \infty.$$

Figure 2: The figure on the left shows the Perron-Frobenius eigenvalue
$\lambda_\alpha = \xi_\alpha$ for the LMS model with
$\phi_t = (s_t, s_{t-1})^T$, with the stability region marked. The figure
on the right shows the case where $\phi_t = (s_t, s_{t-1}, s_{t-2})^T$.
In both cases, the sequence $s$ is i.i.d. Bernoulli.

Figure 3: The maximal eigenvalues
$\eta_\alpha = \xi_\alpha^{\mathcal{Q}}$ are piecewise quadratic in
$\alpha$ in the case where $\phi_t = (s_t, s_{t-1})^T$ with $s$ as above.